Abstract: In peer-to-peer systems, large amounts of data are distributed among multiple sources. Analyzing this data and identifying clusters is difficult because of processing, storage, and transmission costs. Distributed data mining focuses on adapting data-mining algorithms to distributed computing environments. In this paper, we propose GD Cluster, a general, fully decentralized clustering method based on the K-Harmonic Means algorithm that is capable of clustering dynamic and distributed data sets. Nodes work continuously through decentralized gossip-based communication to maintain summarized views of the data set. K-Harmonic Means is essentially insensitive to the initialization of the centers, so the quality of the resulting clustering does not hinge on how the centers are initially chosen.

Keywords: Distributed systems, clustering, dynamic systems, partition-based clustering, density-based clustering.